feat(tools/modal): add Function + Volume capabilities by KillerQueen-Z · Pull Request #55 · BlockRunAI/Franklin

KillerQueen-Z · 2026-05-12T09:00:53Z

Summary

Adds 6 new Modal capabilities for long-running GPU workflows. Pairs with BlockRunAI/blockrun#16: the gateway must be merged and deployed first, and this PR is a no-op until then.

New capabilities

| Tool | Purpose |
| --- | --- |
| ModalDeployFunction | Register a long-running Python function (custom pip, GPU, up to 24h). Charges max_timeout × hourly rate upfront. |
| ModalRunFunction | Trigger a deployed function. Returns run_id; poll for result. |
| ModalGetFunctionStatus | Poll a run for status/result/error. |
| ModalCreateVolume | Create persistent storage. $0.20/GB-month, 1 month prepaid. |
| ModalListVolumes | List caller's volumes. |
| ModalDeleteVolume | Delete a volume (no refund). |
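
The run-then-poll flow described for ModalRunFunction and ModalGetFunctionStatus might look roughly like the sketch below. It is not the PR's implementation: `RunStatus`, its state names, `FetchStatus`, `pollRun`, and the interval and attempt defaults are all hypothetical, since the description only says that a run returns a run_id and that status reports status/result/error.

```typescript
// Hypothetical shape of a status response; the real field names are not given in this PR.
type RunStatus =
  | { state: "pending" | "running" }
  | { state: "succeeded"; result: unknown }
  | { state: "failed"; error: string };

// The caller supplies the lookup, e.g. a thin wrapper around ModalGetFunctionStatus.
type FetchStatus = (runId: string) => Promise<RunStatus>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Poll until the run reaches a terminal state or the attempt budget is spent.
async function pollRun(
  runId: string,
  fetchStatus: FetchStatus,
  intervalMs = 5000,
  maxAttempts = 120,
): Promise<unknown> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus(runId);
    if (status.state === "succeeded") return status.result;
    if (status.state === "failed") throw new Error(status.error);
    await sleep(intervalMs);
  }
  throw new Error(`run ${runId} did not finish after ${maxAttempts} polls`);
}
```

Because compute is paid at deploy time, polling itself should not add compute cost; a bounded attempt count mainly protects the caller from waiting forever on a run that never reports a terminal state.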

Use case

Closes the gap between the existing 24h-capped Sandbox path (`ModalCreate`) and real ML workflows: fine-tuning, batch jobs with checkpoints, multi-day data pipelines.

Pricing

v1 charges upfront at deploy time using the same hourly tiers as long-task sandbox ($0.10/h CPU → $8/h H100). NO REFUND on early termination — over-allocating `timeout` wastes USDC.
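
To make the cost of over-allocating concrete, here is a minimal sketch of the upfront charge. It assumes the charge scales linearly with the timeout and that the timeout is expressed in seconds; neither is confirmed by this PR. Only the two endpoint rates ($0.10/h CPU and $8/h H100) come from the description, and the names `HOURLY_RATE_USD`, `Tier`, and `upfrontChargeUsd` are hypothetical.

```typescript
// Hourly rates in USD. Only the CPU and H100 endpoints are stated in the PR;
// the intermediate tiers are not listed there, so they are omitted here.
const HOURLY_RATE_USD = {
  cpu: 0.1,
  h100: 8,
} as const;

type Tier = keyof typeof HOURLY_RATE_USD;

// Upfront charge = max timeout (in hours) x hourly rate. No refund on early finish.
// Assumption: linear proration and a timeout given in seconds.
function upfrontChargeUsd(tier: Tier, timeoutSeconds: number): number {
  if (timeoutSeconds <= 0 || timeoutSeconds > 24 * 3600) {
    throw new RangeError("timeout must be greater than 0 and at most 24h");
  }
  return (timeoutSeconds / 3600) * HOURLY_RATE_USD[tier];
}

// A job that needs 2h on an H100 but is deployed with the full 24h timeout.
const neededUsd = upfrontChargeUsd("h100", 2 * 3600); // 16
const allocatedUsd = upfrontChargeUsd("h100", 24 * 3600); // 192
const wastedUsd = allocatedUsd - neededUsd; // 176
```

Under these assumptions the gap between a tight and a maximal timeout on the top tier is $176 of non-refundable USDC, which is why the timeout should be sized to the job.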

Smart-rebate / actual-usage settlement is Phase B, documented in the gateway team's Notion checklist.

Test plan

- Build passes (`npm run build`)
- After gateway is deployed: e2e test from Franklin CLI:
  - Deploy a trivial CPU function
  - Run it with input
  - Status returns expected result
  - Create + list + delete a volume

No breaking changes

All existing ModalCreate/Exec/Status/Terminate capabilities remain. New capabilities are additive in the `modalCapabilities` array.

This is the first feature/vscode-extension* branch built on top of origin/main directly rather than a stack of cherry-picks. The previous branch had drifted ~500 commits behind main as upstream shipped:
- v3.10.0 detached background tasks (Detach tool + franklin task CLI)
- v3.9.0 Skills system (SKILL.md loader, registry, bundled grill)
- v3.9.1 status bar shows chain + default spend cap raised to $2
- v3.9.2 Kimi K2.6 alignment
- v3.9.3 /model picker trim 28 → 23
- v3.9.4 roleplayed JSON tool-calls + V4 Flash / Omni metadata
- v3.9.5 Nemotron Omni prose stripping + gpt-image-2 size pin
- v3.9.6 reasoning-model TTFB defaults + long-task guidance
- v3.8.40 i2i timeout (#19) + configurable spend cap (#20); already our PRs, now confirmed merged
- v3.8.41 smart timeout recovery (#26)
- v3.8.42 default spend cap $0.25 → $1.00 (#28)
- v3.8.43 proxy: per-request timeout + payment-aware fallback (#31)
- #34 SKILL.md skills loader
- #35 first-class Wallet tool

Cherry-picking each onto the old branch would have produced a wall of no-op-content / phantom conflicts (the cherry-picks didn't share commit hashes with main even though their content matched). Instead this branch starts from origin/main and re-applies only the bits that are genuinely extension-specific:
- vscode-extension/ (entire directory: webview app, build, README, mascot images, VSIX assets)
- src/api/vscode-session.ts (new file: extension-host session helper)
- src/commands/config.ts (added default-image-model + default-video-model keys; exported saveConfig for the settings popover; kept main's $1 default comment + max-turn-spend-usd key)
- src/agent/streaming-executor.ts (added ImageGen / VideoGen case to inputPreview so timeline shows model)
- src/commands/doctor.ts (export runChecks so vscode-session can re-export it as runDoctorChecks)
- package.json (./vscode-session export, alongside ./wallet, etc.)

Bumps vscode-extension to 0.5.0 (was 0.4.5).

Also adds vscode-extension/*.vsix to .gitignore; packaged builds shouldn't be tracked. Old feature/vscode-extension preserved at backup/vscode-extension-pre-sync.

Mirror of upstream PR #36 (fix/savings-includes-media-cost).

The "Saved vs Opus" panel hero would show negative dollar amounts as soon as a user spent meaningfully on ImageGen / VideoGen, e.g.

$-8.79
You spent $20.4896 instead of $11.70

Root cause: getStatsSummary() compared an Opus-token baseline (chat only; image/video log inputTokens=0/outputTokens=0) against totalCostUsd (chat + media combined), so once media spend exceeded the chat-vs-Opus delta the difference flipped negative.

Fix: split byModel into chatOnlyCost (rows with tokens) and mediaCost (rows without). opusCost on the display side now equals opusChatCost + mediaCost so "you spent X instead of Y" stays apples-to-apples; saved = max(0, opusChatCost - chatOnlyCost) is the chat-side delta only and is clamped non-negative.

Bumps vscode-extension to 0.5.1; updates README changelog.
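
The split described in this fix can be sketched as follows. This is an illustration of the arithmetic only, not the code in src/stats/tracker.ts: the `UsageRow` and `SavingsSummary` shapes and the `summarizeSavings` name are hypothetical, and the example splits the $20.4896 total into an invented $3.00 of chat and $17.4896 of media purely to reproduce the reported numbers.

```typescript
// One per-model usage row. Field names are illustrative, not the tracker's real schema.
interface UsageRow {
  costUsd: number;
  inputTokens: number;
  outputTokens: number;
}

interface SavingsSummary {
  chatOnlyCost: number; // rows that logged tokens
  mediaCost: number; // rows with zero tokens (image / video)
  opusCost: number; // displayed baseline: opusChatCost + mediaCost
  saved: number; // chat-side delta only, clamped at zero
}

function summarizeSavings(rows: UsageRow[], opusChatCost: number): SavingsSummary {
  let chatOnlyCost = 0;
  let mediaCost = 0;
  for (const row of rows) {
    if (row.inputTokens > 0 || row.outputTokens > 0) {
      chatOnlyCost += row.costUsd;
    } else {
      mediaCost += row.costUsd;
    }
  }
  return {
    chatOnlyCost,
    mediaCost,
    opusCost: opusChatCost + mediaCost,
    saved: Math.max(0, opusChatCost - chatOnlyCost),
  };
}

// Invented split of the reported $20.4896 total: $3.00 chat + $17.4896 media.
const exampleRows: UsageRow[] = [
  { costUsd: 3.0, inputTokens: 120000, outputTokens: 8000 },
  { costUsd: 17.4896, inputTokens: 0, outputTokens: 0 },
];
const exampleSummary = summarizeSavings(exampleRows, 11.7);
// Old comparison: 11.70 - 20.4896 is negative. New: saved is about 8.70 and never below zero.
```

Adding mediaCost to both sides of the display keeps the two totals comparable, while the clamp guarantees the hero figure cannot go negative even when chat alone costs more than the Opus baseline.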

…ion-v0.5 # Conflicts: # src/panel/html.ts # src/stats/tracker.ts

…ion-v0.5

…+ history rename/delete + Detach cwd fix + insights category breakdown + wallet QR + GPU sandbox panel + session import + rate-limit toast

This branch preserves the in-progress parallel image/video generation feature (concurrent: 'batch' + askUser merge + walletReservation + runBatchPool) which the v0.5 extension branch will revert until it's been validated end-to-end.

Companion features intentionally kept on v0.5:
- Modal sandbox tools (use walletReservation but only from Modal, not media gen)
- Detach cwd-resolution fix
- History rename/delete
- Wallet QR popover
- Tasks + GPU Sandboxes overlay panels
- Session import (Claude Code / Codex)
- Rate-limit friendly toast
- Insights By Category breakdown

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… feature/parallel-media-gen

The cherry-pick brought in everything from the WIP branch including the in-progress parallel image/video pipeline (concurrent: 'batch', batch preflight, askUser mutex, walletReservation, batch-concurrency config, settings UI). That pipeline hasn't been validated end-to-end yet, so it's deferred to feature/parallel-media-gen until ready.

What's KEPT on v0.5 (validated, ship-ready):
- Modal sandbox tools (ModalCreate/Exec/Status/Terminate) + GPU sandbox panel
- Detach cwd-resolution bug fix + 4-strategy fallback
- History rename / delete with inline confirm UI
- Wallet QR popover (chain-aware EIP-681 / Solana Pay)
- Tasks overlay + badge
- Session import (Claude Code / Codex)
- Rate-limit friendly toast
- Insights By Category breakdown (chat/media/sandbox)
- Image gen: response_format strip for gpt-image-* family + verbose error diagnostics + async polling for slow models
- Default-image-model / default-video-model config consultation
- Defensive sanitizeOutgoingMessages in llm.ts
- Modal tool exempt from 3-failure auto-disable in tool-guard
- Settings popover refresh + obsolete max-turn-spend-usd auto-strip

What's REMOVED (now only on feature/parallel-media-gen):
- 'batch' concurrent mode in CapabilityHandler
- BATCH_CONCURRENCY pool + runBatchPool in streaming-executor
- preflightBatch + askUserChain mutex + batchPreApproved Set
- skipAskUser in ExecutionScope
- walletReservation usage from imagegen/videogen
- batch-concurrency config key
- 'Parallel image / video' setting input
- Random suffix on default output paths

WalletReservation infrastructure stays (Modal tools use it).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Brings in 120 commits since the last sync (1106ef5), including:

Vision (this is the big-ticket fix):
- PR #53 + sibling-sites patch: preserve image blocks in budgetToolResults / ageOldToolResults / dedup; client-side sharp resize on Read (1.9MB PNG -> 117KB).

LLM / gateway:
- Gemini Pro non-streaming requests
- 429 Retry-After honoring
- Stream char sanitizer (U+2502 / U+2500 -> ASCII)
- Gateway error text doesn't kill session
- Classifier separates payment_rejected from payment_required

Stats / cost:
- franklin stats reads cost_log.jsonl (SDK ledger)
- recorded-vs-wallet gap detection
- image/video/modal latency measured at 5 callsites
- agent-loop measures real LLM latency (was hardcoded 0)

Loop / agent:
- same-tool warn-once + signature-based stuck detector
- switch model when intent declared without tool_use
- --resume preserves cost / token totals

Wallet / Swap:
- Base0xGaslessSwap (user pays no ETH for gas)
- Base 0x V2 + Permit2
- Jupiter Ultra swap with on-chain referral fee

Prediction Market: full rewrite of wallet-analysis triplet, smartMoney replacement, walletProfile addresses fix.

Trading: TickerToId expansion (TON etc), dual-listing notice for tokenized equities, live-swap session cap.

Modal: latency tracking, logger migration.

ImageGen: HTTP 202 queue handling, latency, error surfacing.

Conflict resolution:
- tools/modal.ts, tools/index.ts, tasks/spawn.ts, session/storage.ts, agent/tool-guard.ts: take main.
- session/storage.ts: ported v0.5's deleteSession + renameSession + SessionMeta.title back on top of main.
- tools/imagegen.ts: hand-merged. Kept v0.5's broader async detection (handles HTTP 202 + status fields + non-JSON body fallback), kept main's bits-based error surfacing + exported pollImageJob singleton, deduped poll helpers.
- tools/videogen.ts: re-added missing videoGenCapability singleton export so test fixtures keep importing it.

Tests: 368/368 pass. Build clean.
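
As an illustration of what "429 Retry-After honoring" in the merge above usually involves, the sketch below converts a Retry-After header into a wait time. It is not taken from the Franklin codebase: `retryAfterMs` and its 1-second fallback are hypothetical. The only external fact relied on is that HTTP allows the header to carry either a whole number of seconds or an HTTP-date.

```typescript
// Convert a Retry-After header value into a wait time in milliseconds.
// The value may be a whole number of seconds or an HTTP-date.
// fallbackMs is an arbitrary default for a missing or unparseable header.
function retryAfterMs(header: string | null, nowMs: number, fallbackMs = 1000): number {
  if (header === null) return fallbackMs;
  const value = header.trim();
  // Form 1: delay in seconds, e.g. "120".
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return fallbackMs;
  // A date in the past means the caller may retry immediately.
  return Math.max(0, dateMs - nowMs);
}
```

A caller would typically pass `response.headers.get("retry-after")` and `Date.now()`, then sleep for the returned duration before retrying the request.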

Bumps the VS Code extension to 0.6.1 and rebundles out/extension.cjs on top of the v0.5↔main merge. The user-visible win is the vision fix: image paste / drop / file Read no longer over-charges ($0.50/call → bounded by client-side sharp resize) and no longer hallucinates descriptions (image blocks now survive the optimizer pipeline end to end). The bundled franklin core jumps from v3.10.x territory to v3.15.90, picking up the 120 commits enumerated in the merge message — the extension inherits all of them with no extension-side code change required (UI surfaces of the new prediction / Base / Modal / etc. tools land automatically through the agent's tool inventory).

…image guard

Brings in d370a38 + 5003b67. The two patches do exactly what we were about to design ourselves (and did some research on, comparing opencode / Aider / Continue / OpenHands / Cline patterns):
- New src/router/vision.ts: curated vision-model allowlist, basename-anchored image-path regex, family-aware sibling picker.
- routeRequest / routeRequestAsync / resolveTierToModel take a new needsVision flag. Auto routing now walks the tier chain for the first vision-capable model when an image is in play; escalates to COMPLEX (Opus) if the whole tier is text-only.
- Manual-mode guard in agent/loop.ts: detects image refs in user input on turn 1, swaps the user's text-only pick to the closest family vision sibling for ONE turn with a visible warning. Next turn's baseModel recovery restores the user's pick.
- proxy/server.ts mirrors the same logic on the Anthropic proxy path (scans messages[] for image / image_url / input_image parts plus paths in text parts).
- 5 new tests; 373/373 pass total.

Better than the design we discussed: their swap-with-warning single-turn approach beats the silent-strip pattern that opencode / Continue / OpenHands all use, by avoiding the "user can't tell what model is running" failure mode of silent model substitution.
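
The auto-routing rule in this commit (take the first vision-capable model in the tier chain, otherwise escalate) can be sketched as below. This is a guess at the shape of the logic, not the contents of src/router/vision.ts: `VISION_MODELS`, the placeholder model IDs, `pickModel`, and the `complexModel` parameter are all hypothetical.

```typescript
// Hypothetical allowlist with placeholder IDs; the real list is not shown in this PR.
const VISION_MODELS = new Set<string>(["example/vision-small", "example/vision-large"]);

// Pick a model from an ordered tier chain.
// With needsVision set, only allowlisted models qualify; if none qualify
// (or the chain is empty), escalate to the COMPLEX-tier model.
function pickModel(tierChain: string[], needsVision: boolean, complexModel: string): string {
  const candidates = needsVision
    ? tierChain.filter((model) => VISION_MODELS.has(model))
    : tierChain;
  return candidates[0] ?? complexModel;
}

// Text-only request: the first model in the chain wins.
const textPick = pickModel(["example/text-fast", "example/vision-small"], false, "example/complex");
// Image in play: the text-only model is skipped.
const visionPick = pickModel(["example/text-fast", "example/vision-small"], true, "example/complex");
// Whole tier is text-only: escalate.
const escalatedPick = pickModel(["example/text-fast"], true, "example/complex");
```

Keeping the chain order intact means the cheapest eligible model is still preferred, so an image only forces escalation when the tier has no vision-capable option at all.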

…ector

Brings in d8803cd + 4ddf2f1. Adds:
- Per-(tool, category) classification of tool failures
- Anomaly detector that flags tools with above-baseline failure rates
- 'franklin doctor --anomaly' surfaces the report

Conflict: package-lock.json (regenerated via npm install).

Tests: 381/381 pass.

Pairs with BlockRunAI/blockrun rfc/modal-full-chain (gateway PR).

New capabilities:
- ModalDeployFunction: register a long-running Python function on Modal (custom pip deps, GPU choice, up to 24h timeout). Charges max_timeout × hourly rate upfront, the same model as long-task sandbox.
- ModalRunFunction: trigger a deployed function. Returns run_id; poll for result. Compute already paid at deploy.
- ModalGetFunctionStatus: poll a run for status/result/error.
- ModalCreateVolume: create persistent storage. $0.20/GB-month, 1 month prepaid. Up to 200GB per wallet.
- ModalListVolumes: list caller's volumes.
- ModalDeleteVolume: delete a volume (no refund).

These close the gap between the 24h-capped Sandbox path and the long-running ML workflows agents need (fine-tuning, batch jobs, persistent checkpoints).

Smart-rebate / actual-usage settlement is Phase B (documented separately in the gateway team's Notion checklist). v1 charges upfront and does not refund early-finish.

Wire-level design: see RFC in BlockRunAI/blockrun (rfc/modal-full-chain). Gateway must be deployed first; this client PR is a no-op until then.
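
A minimal sketch of the volume prepayment follows, using the stated $0.20/GB-month rate, the one-month prepay, and the 200GB per-wallet limit. It assumes the limit applies to the combined size of all of a wallet's volumes, which the commit message does not spell out. `volumePrepayUsd` and its parameters are hypothetical names, not the gateway's API.

```typescript
const VOLUME_RATE_USD_PER_GB_MONTH = 0.2;
const MAX_GB_PER_WALLET = 200;

// Price of creating a volume: one month prepaid at the per-GB rate.
// Assumption: the 200GB cap counts every volume the wallet already holds.
function volumePrepayUsd(sizeGb: number, existingGb = 0): number {
  if (sizeGb <= 0) {
    throw new RangeError("volume size must be positive");
  }
  if (existingGb + sizeGb > MAX_GB_PER_WALLET) {
    throw new RangeError(`wallet would exceed ${MAX_GB_PER_WALLET}GB`);
  }
  return sizeGb * VOLUME_RATE_USD_PER_GB_MONTH;
}

// A 50GB checkpoint volume costs about $10 for the prepaid month.
const checkpointVolumeUsd = volumePrepayUsd(50);
```

Since deletion carries no refund, the prepaid month is spent regardless of how long the volume actually lives.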

Franklin's main hadn't run CI since 2026-04-21; some package.json change landed without an accompanying lockfile bump, so 'npm ci' fails on:

npm error Missing: utf-8-validate@5.0.10 from lock file

Regenerated cleanly via 'rm -rf node_modules package-lock.json && npm install'. Lockfile is now in sync with current package.json.

This commit is unrelated to the Modal capabilities being added in this PR; it is included solely to unblock CI on this branch (and incidentally on main too).

KillerQueen-Z and others added 12 commits April 30, 2026 02:03

Merge remote-tracking branch 'origin/main' into feature/vscode-extens…

16f67f7

…ion-v0.5 # Conflicts: # src/panel/html.ts # src/stats/tracker.ts

Merge remote-tracking branch 'origin/main' into feature/vscode-extens…

1106ef5

…ion-v0.5

KillerQueen-Z wants to merge 12 commits into main from feat/modal-functions-volumes
